Abstract

We propose the n-clique network as a powerful tool for understanding global struc- 
tures of combined highly-interconnected subgraphs, and provide theoretical pre- 
dictions for statistical properties of the n-clique networks embedded in a complex 
network using the degree distribution and the clustering spectrum. Furthermore, us- 
ing our theoretical predictions, we find that the statistical properties are invariant 
between 3-clique networks and original networks for several observable real-world 
networks with the scale-free connectivity and the hierarchical modularity. The re- 
sult implies that structural properties are identical between the 3-clique networks 
and the original networks. 
1 Introduction



Cliques are highly interconnected subgraphs (complete graphs) and appear
prominently in networks that describe a wide range of complex systems, from
the level of cells to society. Cliques have been actively investigated in
recent years because they provide important insights into information
processing, hierarchical modularity, and community structures. For instance, in
gene regulatory networks, small cliques correspond to the feed-forward loop,
which is one of the network motifs (H). The motifs play an important role in
gene regulation (jsj) and are regarded as building blocks of life. Furthermore,
cliques represent clusters, communities, and groups (0; 0) in social networks,
because persons (nodes) are joined by edges when friendships, partnerships,
and similar relations exist among them. Therefore, cliques help to detect
community structures (0) in social networks. Similarly, in protein-protein
interaction networks, cliques are powerful tools for understanding the
evolution of proteins and for predicting the functions of proteins whose
function is unknown (0), because proteins that have the same function tend to
interact.



Motivated by these breakthroughs, recent efforts have analytically evaluated
the abundance of subgraphs, including cliques, on the basis of statistical
mechanics (jilQ), yielding detailed knowledge about local interaction
patterns (p) and the time evolution of the abundance of subgraphs including
cliques (9). These previous works focus on local information, such as the
abundance of subgraphs and cliques, and on the size of the giant component
that emerges through percolation via a class of subgraphs, such as subgraph
percolation (M), L-percolation (10), and clique percolation (11). In recent
years, however, it has been revealed that real-world networks are built from
overlapping subgraphs, including cliques (Js|; 0); thus it is important to
elucidate the global structures of networks consisting of cliques. For example,
higher-order dynamics emerge from combined network motifs in gene regulatory
networks.

In particular, several power-law statistical properties have been empirically
found in real-world complex networks. One of these properties is scale-free
connectivity (12), which is characterized by a power-law degree distribution
$P(k) \sim k^{-\gamma}$ with empirically found exponents $2 < \gamma < 3$ (13).
Scale-free connectivity means that a few nodes (hubs) are connected to a great
number of nodes, whereas most of the remaining nodes are not. Another property
is hierarchical modularity, which is characterized by a power-law clustering
spectrum $C(k) \sim k^{-\alpha}$ with an empirically found exponent
$\alpha \approx 1$; this property suggests a hierarchical structure of the
cliques (14; 15). The clustering spectrum is defined as the average clustering
coefficient of nodes with degree $k$, where the clustering coefficient is the
density of edges among the neighbors of a node. Since these properties reflect
the global structure of a network, it is important to clarify the relationships
between these properties and the global structures of the combined cliques.
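
The definition of the clustering spectrum can be illustrated with a short sketch. The code below is only an illustration under stated assumptions: the adjacency representation (a dictionary of neighbor sets), the function name `clustering_spectrum`, the toy graph, and the convention that nodes with degree below 2 get a clustering coefficient of 0 are all choices made here, not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def clustering_spectrum(adj):
    """Return {k: average clustering coefficient of the nodes with degree k}.

    `adj` maps each node to the set of its neighbors (undirected, no self-loops).
    Assumed convention: nodes with degree < 2 have clustering coefficient 0.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k >= 2:
            # Count the edges that actually exist among the neighbors.
            links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
            c = 2.0 * links / (k * (k - 1))
        else:
            c = 0.0
        sums[k] += c
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


# Hypothetical toy graph: a triangle (0, 1, 2) with a pendant node 3 on node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
spectrum = clustering_spectrum(toy)
```

In this toy graph the two degree-2 nodes have fully linked neighbors, while the degree-3 node has only one of its three neighbor pairs linked, so the spectrum decreases with degree.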

In this paper, we propose the n-clique network as a powerful tool for
understanding the global structures of combined highly interconnected
subgraphs. Furthermore, we provide theoretical predictions for well-known
statistical properties of n-clique networks embedded in a complex network,
using the degree distribution and the clustering spectrum, and evaluate these
predictions with numerical simulations. The theoretical predictions are
established by applying the statistical method in (0). Moreover, we discuss
the relationships between the statistical properties of several real-world
networks and those of their n-clique networks.



2 n-clique networks 



An n-clique network is the set of nodes and edges that are contained in n-node
cliques (n-node complete graphs) embedded in an original network. Figure 1
shows a schematic diagram of n-clique networks. The original network
[Fig. 1(a)] has two clique networks [Figs. 1(b) and 1(c)], which are drawn as
the circled black nodes with black edges. The gray nodes and edges are
eliminated because they belong to no n-clique. n-clique networks are extracted
from an original network by following this procedure. In addition, an original
network is equivalent to its 2-clique network in the absence of isolated nodes,
that is, nodes that have no edges. In this paper, we assume that the original
networks have no isolated nodes. We use an algorithm based on network motif
detection ([]]) to find the cliques. Although determining the clique abundance
is computationally intractable (NP-hard) in general, the enumeration of
n-cliques in a given network can be done in polynomial time if n is a
constant (16).
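
The extraction procedure described above can be sketched in a few lines. This is a brute-force illustration for small graphs, not the motif-detection algorithm used in the paper; the function names, the dictionary-of-sets representation, and the toy graph are assumptions made here. An edge is kept when its two endpoints share n - 2 common neighbors that are all mutually linked.

```python
from itertools import combinations


def edge_in_n_clique(adj, u, v, n):
    """True if the edge (u, v) lies in at least one n-node clique."""
    common = adj[u] & adj[v]
    for others in combinations(common, n - 2):
        # The n - 2 extra nodes must be pairwise adjacent as well.
        if all(b in adj[a] for a, b in combinations(others, 2)):
            return True
    return False


def n_clique_network(adj, n):
    """Return the n-clique network of `adj` as a dict of neighbor sets.

    Nodes and edges that belong to no n-clique are dropped.
    """
    sub = {}
    for u in adj:
        for v in adj[u]:
            if edge_in_n_clique(adj, u, v, n):
                sub.setdefault(u, set()).add(v)
                sub.setdefault(v, set()).add(u)
    return sub


# Hypothetical toy graph: a 4-clique {0, 1, 2, 3}, a triangle {3, 4, 5},
# and a pendant edge 5-6 that belongs to no triangle.
toy = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3},
    3: {0, 1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4, 6}, 6: {5},
}
three = n_clique_network(toy, 3)
four = n_clique_network(toy, 4)
```

For n = 2 every edge is trivially kept, so the sketch returns the original network, in line with the equivalence stated in the text. The cost grows roughly as the number of edges times the number of (n - 2)-subsets of common neighbors, which is polynomial for constant n.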



3 Degree distribution 



We consider the degree distribution of an n-clique network, $P(k^{(n)})$. The
degree distribution is defined as the probability of finding a node with degree
$k^{(n)}$, which is the number of edges at a node in the n-clique network. In
addition, $P(k^{(2)})$ coincides with the degree distribution $P(k)$ of the
original network because $k^{(2)} = k$.




(a) original network (b) 3-clique network (c) 4-clique network 



Fig. 1. Schematic diagram of n-clique networks embedded in the original network 
(a). The n-clique networks [(b) and (c)] are expressed as the circled black nodes 
with black edges. 

Fig. 2. Degree distributions of n-clique networks embedded in the BA network
with $N = 3{,}000$ and $\langle k \rangle = 16$ (shifted for clarity), where
$\langle k \rangle$ denotes the average degree. The symbols correspond to the
numerical results, and the dashed lines are the theoretical predictions given
by Eq. (3). The solid lines show $P(k) \propto k^{-3}$.



In order to establish a theoretical prediction for the degree distribution of
n-clique networks, we propose an approximation based on the statistical method
in (0). We assume that the clustering spectrum $C(k)$ corresponds to the
probability that two neighbors of a node with degree $k$ ($\geq 2$) are linked.
First, we consider the probability $\phi_n(k)$ that an edge on a node with
degree $k$ is eliminated by the extraction of the n-clique network from an
original network. For simplicity, we assume that the probability that an edge
is eliminated from a node is independent of both the probability that another
edge is eliminated from the same node and the probability that the same edge
is eliminated from the neighbor at its other end. This assumption is a suitable
approximation for random graphs (17), because the probability that there is an
edge between two nodes is constant. We show with numerical simulations that the
approximation is also suitable for arbitrary large-scale graphs (networks) at
large $k$. Here, we focus on the subset that consists of a node with degree
$k$, its neighboring nodes, and the edges among these nodes. An edge on the
node with degree $k$ can belong to $\binom{k-1}{n-2}$ n-cliques, each of which
is formed with probability $C(k)^{n_p}$, where $n_p = (n-1)(n-2)/2$. That is,
the probability that the edge is not contained in a given one of the
$\binom{k-1}{n-2}$ n-cliques is $[1 - C(k)^{n_p}]$. Since the edge is
eliminated if it is contained in no n-clique, the assumption of independence
allows the probability $\phi_n(k)$ to be written as

$$\phi_n(k) = \left\{1 - C(k)^{n_p}\right\}^{\binom{k-1}{n-2}}. \tag{1}$$
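
As a minimal numerical sketch of Eq. (1), the function below evaluates the elimination probability of an edge on a node with degree k, given the value of the clustering spectrum at that degree. The function names and the sample value C(k) = 0.02 used in the example call are illustrative assumptions.

```python
from math import comb


def n_p(n):
    """Number of edges needed among the n - 1 neighbors that close an n-clique."""
    return (n - 1) * (n - 2) // 2


def phi(n, k, c_k):
    """Eq. (1): probability that an edge on a degree-k node is eliminated.

    `c_k` is the clustering spectrum C(k) evaluated at degree k.
    """
    # The edge survives if at least one of its comb(k - 1, n - 2) candidate
    # n-cliques is complete; each is complete with probability c_k ** n_p(n).
    return (1.0 - c_k ** n_p(n)) ** comb(k - 1, n - 2)


# Example call with assumed inputs: n = 3, k = 16, C(k) = 0.02.
example = phi(3, 16, 0.02)
```

Two limiting cases follow directly from the formula: for n = 2 no edge is eliminated, and for k < n - 1 every edge is eliminated because no n-clique can contain the node.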



Next, we characterize the conditional probability $\Phi_n(k^{(n)}|k)$ that the
degree shifts from $k$ to $k^{(n)}$ due to the extraction of the n-clique
network, using the probability $\phi_n(k)$. The conditional probability follows
the binomial distribution, and we have

$$\Phi_n(k^{(n)}|k) = \binom{k}{k^{(n)}} \left[1 - \phi_n(k)\right]^{k^{(n)}} \phi_n(k)^{k - k^{(n)}}. \tag{2}$$



The degree distribution of an n-clique network, $P(k^{(n)})$, is proportional
to the sum of $P(k)\,\Phi_n(k^{(n)}|k)$ over
$k = k^{(n)}, k^{(n)}+1, \ldots, k_{\max}$. Therefore, the degree distribution
is finally described as

$$P(k^{(n)}) = \frac{N}{N_n} \sum_{k=k^{(n)}}^{k_{\max}} P(k)\, \Phi_n(k^{(n)}|k), \tag{3}$$



where $N$ and $N_n$ are the total numbers of nodes in the original network and
in the n-clique network, respectively. Using $P(k)$ and $C(k)$, the total
number of nodes in the n-clique network can be estimated by

$$N_n = N \sum_{k=n-1}^{k_{\max}} P(k) \left[1 - \left\{1 - C(k)^{n_p}\right\}^{\binom{k}{n-1}}\right]. \tag{4}$$
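
The chain from the binomial shift probability to the predicted degree distribution can be sketched as follows. The code assumes that the shift probability is the binomial of Eq. (2), that the prediction is the weighted sum of Eq. (3), and that the surviving fraction N_n / N is estimated by treating each of the comb(k, n - 1) candidate cliques around a node as independent, which is our reading of Eq. (4). The function names and the toy inputs `P` and `C` are hypothetical and are not data from the paper.

```python
from math import comb


def n_p(n):
    """Number of edges needed among the n - 1 neighbors that close an n-clique."""
    return (n - 1) * (n - 2) // 2


def phi(n, k, c_k):
    """Eq. (1): probability that an edge on a degree-k node is eliminated."""
    return (1.0 - c_k ** n_p(n)) ** comb(k - 1, n - 2)


def transition(n, k_n, k, c_k):
    """Eq. (2): binomial probability that degree k shrinks to k_n."""
    keep = 1.0 - phi(n, k, c_k)
    return comb(k, k_n) * keep ** k_n * (1.0 - keep) ** (k - k_n)


def surviving_fraction(n, P, C):
    """Eq. (4) divided by N: estimated fraction of nodes in the n-clique network."""
    return sum(
        p_k * (1.0 - (1.0 - C[k] ** n_p(n)) ** comb(k, n - 1))
        for k, p_k in P.items()
        if k >= n - 1
    )


def clique_degree_distribution(n, P, C):
    """Eq. (3): predicted P(k^(n)) for k^(n) = 1, ..., k_max."""
    ratio = 1.0 / surviving_fraction(n, P, C)  # N / N_n
    k_max = max(P)
    return {
        k_n: ratio * sum(P[k] * transition(n, k_n, k, C[k]) for k in P if k >= k_n)
        for k_n in range(1, k_max + 1)
    }


# Hypothetical toy inputs: P maps degree -> probability, C maps degree -> C(k).
P = {1: 0.5, 2: 0.3, 3: 0.2}
C = {1: 0.0, 2: 0.5, 3: 0.4}
prediction = clique_degree_distribution(3, P, C)
```

For n = 2 the sketch returns the original distribution unchanged, as expected. Because the sum and the normalization rest on different independence approximations, the returned values need not sum exactly to one for small toy inputs such as these.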



In order to confirm the theoretical predictions, we performed numerical
simulations for the Barabási-Albert (BA) network (20), which yields a power-law
degree distribution $P(k) \sim k^{-\gamma}$ with the degree exponent
$\gamma = 3$. Figure 2 shows the degree distributions of the n-clique networks
embedded in the BA network. As shown in Fig. 2, our theoretical predictions
are in good agreement with the numerical results, indicating that the
approximation is suitable. In addition, the degree distributions of the
n-clique networks differ from that of the original network.
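
A BA network of the kind used in these simulations can be generated by preferential attachment. The sketch below is one common way to do this and is not the authors' simulation code: it assumes that each new node attaches m = 8 edges (which gives an average degree close to 16), starts from m unconnected seed nodes, and uses an illustrative function name and seed.

```python
import random


def barabasi_albert(num_nodes, m, seed=None):
    """Grow a BA network; returns a dict mapping node -> set of neighbors.

    Each new node attaches to m distinct existing nodes chosen with
    probability proportional to their current degree.
    """
    rng = random.Random(seed)
    adj = {i: set() for i in range(num_nodes)}
    targets = list(range(m))  # the first new node links to all m seed nodes
    repeated = []             # every node appears once per incident edge
    for source in range(m, num_nodes):
        for t in targets:
            adj[source].add(t)
            adj[t].add(source)
        repeated.extend(targets)
        repeated.extend([source] * m)
        # Sampling from `repeated` is sampling proportionally to degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return adj


# Smaller than the N = 3,000 network in the text, to keep the example fast.
network = barabasi_albert(300, 8, seed=1)
average_degree = sum(len(nbrs) for nbrs in network.values()) / len(network)
```

The resulting adjacency dictionary can be passed to any routine that extracts n-clique networks or measures the degree distribution.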



4 Shift of the degree 



The degree of a node shifts when an n-clique network is extracted from an
original network. Here, we derive theoretical predictions for this shift from
the statistical properties of the original network. Using the probability
$\phi_n(k)$ [Eq. (1)] that an edge is eliminated by the extraction of the
n-clique network, the expectation value of the degree of a node in the
n-clique network can be written as

$$k^{(n)} = k \left[1 - \phi_n(k)\right]. \tag{5}$$



Fig. 3. Shift of the degree of a node due to the extraction of n-clique
networks from the BA network with $N = 3{,}000$ and $\langle k \rangle = 16$.
The symbols correspond to the numerical results, and the dashed lines are given
by Eq. (5). The solid lines show $k^{(n)} \propto k$.

The probability $\phi_n(k)$ depends on the clustering spectrum $C(k)$, as shown
in Eq. (1). Since it is empirically found that the spectrum follows a power law
in most complex networks (14), we assume a power-law spectrum,
$C(k) = C_0 k^{-\alpha}$. Moreover, we use the property of Napier's number,
$e^{-c} \simeq (1 - c/k)^k$ for large $k$, to rewrite the probability
$\phi_n(k)$ [Eq. (1)]. In doing so, we have

$$\phi_n(k) \simeq \exp\left[-\frac{C_0^{n_p}}{(n-2)!}\, k^{\zeta_n}\right], \tag{6}$$

where

$$\zeta_n = n - n_p \alpha - 2. \tag{7}$$



In particular, the probability $\phi_n(k)$ is independent of the degree $k$
when $\zeta_n = 0$, and the proportional relationship between $k^{(n)}$ and $k$
is satisfied.
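
The large-k approximation and the role of the exponent can be checked numerically. The sketch below assumes the power-law spectrum C(k) = C_0 k^(-alpha), evaluates the exact elimination probability of Eq. (1) and the exponential form with exponent n - n_p * alpha - 2 from Eqs. (6) and (7), and computes the expected degree of Eq. (5). The function names and the parameter values in the example calls are illustrative assumptions.

```python
from math import comb, exp, factorial


def n_p(n):
    """Number of edges needed among the n - 1 neighbors that close an n-clique."""
    return (n - 1) * (n - 2) // 2


def zeta(n, alpha):
    """Eq. (7): exponent of k inside the exponential of Eq. (6)."""
    return n - n_p(n) * alpha - 2


def phi_exact(n, k, c0, alpha):
    """Eq. (1) evaluated with C(k) = c0 * k**(-alpha)."""
    c_k = c0 * k ** (-alpha)
    return (1.0 - c_k ** n_p(n)) ** comb(k - 1, n - 2)


def phi_approx(n, k, c0, alpha):
    """Eq. (6): large-k approximation of the elimination probability."""
    return exp(-(c0 ** n_p(n)) * k ** zeta(n, alpha) / factorial(n - 2))


def expected_degree(n, k, c0, alpha):
    """Eq. (5) combined with Eq. (6): expected degree in the n-clique network."""
    return k * (1.0 - phi_approx(n, k, c0, alpha))


# With alpha = 1 and n = 3 the exponent vanishes, so phi no longer depends on k
# and the expected degree is proportional to k (assumed value c0 = 0.5).
ratio_small = expected_degree(3, 10, 0.5, 1.0) / 10
ratio_large = expected_degree(3, 1000, 0.5, 1.0) / 1000
```

With alpha = 0 the exponent is nonzero for n = 3 and n = 4, so the same functions reproduce a nonlinear dependence of the expected degree on k.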

In order to confirm the theoretical prediction, we performed numerical
simulations for the BA network. Figure 3 shows the shift of the degree of a
node due to the extraction of the n-clique networks. As shown in Fig. 3, our
theoretical prediction is in good agreement with the numerical results.
Figure 4 shows the probability $\phi_n(k)$ obtained from the extraction of the
n-clique networks. Assuming that $C(k) = C_0 k^{-\alpha}$, a least-squares fit
gives $C_0 \approx 0.02$ and $\alpha \approx 0.1$. We compute the theoretical
prediction with these values. As shown in Eq. (6), $\phi_n(k)$ declines
exponentially with $k$, indicating that the degree of a high-degree node tends
to be preserved. The prediction is in agreement with the numerical results.

Fig. 4. Probability $\phi_n(k)$ obtained from the extraction of n-clique
networks from the BA network with $N = 3{,}000$ and $\langle k \rangle = 16$.
The symbols correspond to the numerical results, and the dashed lines are given
by Eq. (6).

Table 1
Network sizes $N$, average degrees $\langle k \rangle$, and characteristic
exponents $\gamma$ and $\alpha$ of the investigated real-world networks and the
BA network. The exponents $\gamma$ and $\alpha$ are extracted using the maximum
likelihood estimation (22) and the analytical approximation (8), that is,
$C(k) = C_0/\{1 + (k/k_0)^{\alpha}\}$, respectively.

Network                      N      <k>    gamma  alpha  Ref.
Internet (AS level)          7,832  4.38   2.4    0.75   (23)
Metabolic (E. coli)          1,273  2.15   3.0    1.0    (24)
Protein interaction (Yeast)  1,485  2.62   2.2    1.3    (25)
Barabási-Albert              3,000  16.0   3.0    0.0    (20)

In the case of $n = 4$, however, the agreement is weaker in Figs. 3 and 4.
There are two reasons. One is the assumption of independence. In scale-free
networks, low-degree nodes tend to connect to high-degree nodes. As shown in
Fig. 4, the probability that an edge on a high-degree node is eliminated is
very small. For this reason, the actual $\phi_n(k)$ for small $k$ tends to be
smaller than predicted by Eq. (6). Therefore, the actual $k^{(n)}$ tends to be
larger than our theoretical prediction. The other is the fluctuation in the
clustering spectrum $C(k)$. In scale-free networks, the fluctuation is large
for small $k$ and, in contrast, small for large $k$ because of the
heterogeneous connectivity. Moreover, the probability that an n-clique is
formed is described as $C(k)^{n_p}$; that is, the error increases with $n_p$.
Therefore, our theoretical prediction tends to agree less well for large $n$
and small $k$.



The clustering spectrum of the BA network is independent of the degree $k$
(21). That is, $\alpha \approx 0$. According to Eq. (7), we predict that the
shift of the degree follows a nonlinear relationship because of the nonzero
$\zeta_n$; for example, $\zeta_3 = 3 - 2 = 1$ and $\zeta_4 = 4 - 2 = 2$. As
shown in Fig. 3, our prediction is in agreement with the numerical results.

Fig. 5. Degree distributions of n-clique networks embedded in the investigated
networks (shifted for clarity). The solid lines in each main panel show
$\propto k^{-\gamma}$, with the exponents $\gamma$ taken from Table 1. Each
inset shows the shift of the degree due to the extraction of the 3-clique
network; the solid lines in the insets correspond to $\propto k$. (a) Internet
(AS level), (b) metabolic network of E. coli, and (c) protein-protein
interaction network of yeast.



5 Invariance of statistical property 



We discuss the statistical properties of n-clique networks embedded in a
network with power-law statistical properties. Here, we focus on scale-free
connectivity, which is one of the well-known power-law statistical properties
and is defined by a power-law degree distribution, $P(k) \sim k^{-\gamma}$. In
networks with scale-free connectivity, we predict that the form of the degree
distribution is invariant between the 3-clique network and the original network
when $\zeta_n = 0$. This is because the degree of a node in the n-clique
network is proportional to its degree in the original network under this
condition.
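
The step from proportional degrees to an invariant distribution can be made explicit. The following is a sketch under simplifying assumptions that are not stated in the text: the degree is treated as a continuous variable, the degree of each node is replaced by its expectation value, and the loss of nodes is absorbed into the normalization. The constant $a$ is introduced here only for illustration.

```latex
% A vanishing exponent makes the elimination probability a constant \phi_n,
% so the expected degree in the n-clique network is proportional to k:
k^{(n)} = a\,k, \qquad a \equiv 1 - \phi_n .
% Conservation of probability under the change of variable k \to k^{(n)}:
P\bigl(k^{(n)}\bigr)\,dk^{(n)} = P(k)\,dk
\;\Longrightarrow\;
P\bigl(k^{(n)}\bigr) = \frac{1}{a}\,P\!\left(\frac{k^{(n)}}{a}\right)
\propto \bigl(k^{(n)}\bigr)^{-\gamma} .
```

When the exponent is nonzero, the map from $k$ to $k^{(n)}$ is nonlinear and the same change of variable no longer preserves the power-law form.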

In order to verify our prediction, we investigate the degree distributions of
n-clique networks embedded in several real-world networks with scale-free
connectivity: the autonomous system representation of the Internet (23), the
metabolic network of Escherichia coli (24), and the protein-protein interaction
network of yeast (25). These real-world networks have hierarchical modularity,
indicated by power-law clustering spectra, $C(k) \sim k^{-\alpha}$ with
$\alpha \approx 1$ (14). In addition, we also consider the BA network, which
does not have hierarchical modularity, for comparison. We summarize the network
sizes, the average degrees, and the exponents characterizing each network in
Table 1.

The exponents $\alpha$ of the real-world networks with hierarchical modularity
are almost one (fbl) (see also Table 1). Therefore, we expect that the form of
the degree distribution is invariant between the 3-clique network and the
original network because $\zeta_3 \approx 3 - 1 - 2 = 0$. Figure 5 shows the
degree distributions of the n-clique networks embedded in the real-world
networks. As expected, the form of the degree distribution is invariant between
the original network and the 3-clique network because of the proportional
relationship between $k$ and $k^{(3)}$ (see the insets in Fig. 5).



In contrast, the exponent $\alpha$ of the BA network is zero (21) (see also
Table 1) because there is no hierarchical modularity. Therefore, we predict
that the power-law degree distribution of the original network changes upon
extraction of the 3-clique network (because $\zeta_3 = 3 - 2 = 1$). Figure 2
shows the degree distributions of the n-clique networks embedded in the BA
network. As expected, the form of the degree distribution differs between the
3-clique network and the original network because of the nonlinear relationship
between $k$ and $k^{(3)}$ (Fig. 3).



6 Discussion and conclusion 



In this paper, we have provided theoretical predictions, based on the
approximation method, for the degree distribution of an n-clique network and
for the shift of the degree due to the extraction of the n-clique network.
Moreover, we performed numerical simulations and showed that the numerical
results are in good agreement with our theoretical predictions, indicating that
the approximation method is suitable.

Furthermore, using our theoretical predictions, we have found that the
power-law degree distributions are identical between the 3-clique networks and
the original networks in scale-free networks with hierarchical modularity. We
have focused only on the power-law degree distributions in this paper. However,
because of the proportional relationship between $k$ and $k^{(3)}$, the same
holds for the other power-law statistical properties observed in real-world
networks: the hierarchical modularity (fli) and the assortativity (liti).

We have confirmed that these power-law statistical properties are invariant
between the 3-clique networks and the original networks, although the data are
not shown for lack of space. The invariance of the statistical properties
implies that the structural properties are identical between the 3-clique
networks and the original networks. In addition, from these results, we expect
that the 3-clique networks are constructed by the same mechanisms as the
original networks with hierarchical modularity.

In contrast, we have found that the 3-clique network embedded in the BA
network, which does not have hierarchical modularity, has statistical
properties different from those of the original network. That is, the
structural properties differ between the 3-clique network and the original
network in the case of the BA network.

We believe that these results provide new insights into the global structures
of combined network motifs and community structures (0; 27) in social and
biological networks. In particular, we found that structural properties are
identical between the 3-clique networks and the original networks. This leads
us to expect that the 3-clique networks are constructed by the same design
principles as the original networks with hierarchical modularity, and it
implies that clique networks help us to understand the design principles and
global structures of combined significant subgraphs, which reflect communities
and functional modules in networks.



For example, it is believed that most real-world networks are constructed by
preferential attachment (12; 13). Because of the structural identity between
the 3-clique networks and the original networks, we expect that the clique
networks are also constructed by the same preferential attachment as the
original networks. This mechanism suggests a preferential attachment of cliques
(15). Indeed, a preferential attachment of communities has been reported in
social networks (28). Furthermore, in biological networks, cliques correspond
to functional modules such as network motifs. In particular, the 3-node clique,
which represents network motifs such as the feed-forward loop, appears
frequently. From our result, we expect that a network consisting only of
network motifs is constructed by the same preferential attachment as the
original network. If so, the motifs may be concentrated on hubs. Indeed, such a
concentration of motifs has been found by network analysis.

In this manner, we believe that new structural properties and new insights into
the design principles of networks can be found through the analysis of clique
networks, and our theoretical predictions may help this analysis and its
interpretation. In biological networks especially, it is difficult to discuss
network formation processes because ancestral networks are not available; we
therefore believe that this analysis helps us to understand the design
principles of networks. In addition, more realistic growing network models may
be established through this analysis.